Predict Restaurant Menu Items Profitability
Background:
In the highly competitive restaurant industry, understanding the profitability of menu items is crucial for maintaining a successful business. Profitability can be influenced by various factors such as the ingredients used, the price of the item, the restaurant category, and customer preferences. Efficiently predicting which menu items are likely to be more profitable can help restaurant managers make informed decisions about menu design, pricing strategies, and inventory management.
Objective:
The objective of this project is to develop a predictive model that can classify the profitability of restaurant menu items into categories such as Low, Medium, and High. This model will leverage historical data on menu items, including their prices, ingredients, and other relevant attributes, to make accurate profitability predictions.
Data:
The dataset consists of 1000 entries, each representing a menu item from various restaurants. The features of the dataset are as follows:
- RestaurantID: Unique identifier for the restaurant.
- MenuCategory: Category of the menu item (e.g., Appetizers, Main Course, Desserts).
- MenuItem: Name of the menu item.
- Ingredients: List of ingredients used in the menu item.
- Price: Price of the menu item.
- Profitability: Profitability category of the menu item (Low, Medium, High).
Tasks:
1. Data Exploration and Preprocessing:
- Conduct exploratory data analysis (EDA) to understand the distribution and relationships within the data.
- Handle missing values, if any, and encode categorical features appropriately.
- Engineer new features that may help in improving the model's performance, such as the number of ingredients used or specific ingredient indicators.
2. Model Development:
- Develop several machine learning models (e.g., RandomForestClassifier, DecisionTreeClassifier, XGBClassifier) to predict the profitability of menu items.
- Perform hyperparameter tuning using GridSearchCV to find the best parameters for each model.
3. Model Evaluation:
- Evaluate the performance of each model using metrics such as accuracy and F1 score.
- Compare the models and select the best-performing one based on the evaluation metrics.
4. Deep Learning Model:
- Develop a Deep Neural Network (DNN) model with multiple layers to predict profitability.
- Train and evaluate the DNN model, and compare its performance with traditional machine learning models.
5. Model Saving and Deployment:
- Save the best models (those with test accuracy greater than 80%) in the Models directory for future use.
- Document the model training and evaluation process, and provide recommendations for deploying the model in a production environment.
Expected Outcome:
By the end of this project, we aim to have a robust predictive model that accurately classifies the profitability of restaurant menu items. This model can be used by restaurant managers to optimize their menus, set competitive prices, and ultimately enhance their profitability.
Impact:
The implementation of this predictive model will enable restaurants to make data-driven decisions, leading to improved financial performance and customer satisfaction. By understanding which menu items are more profitable, restaurants can focus on promoting and improving these items, thereby maximizing their overall profitability.
Acknowledgements:
We would like to extend our gratitude to the following resources and tools, which made this project possible:
- TensorFlow: For providing a robust framework for developing and training deep learning models.
- Scikit-learn (sklearn): For offering comprehensive tools for data analysis, preprocessing, and machine learning.
- Keras: For its easy-to-use API that facilitated the creation and training of our deep neural network models.
- Kaggle: For providing the dataset that was crucial for training and evaluating our models.
Their contributions have been invaluable in achieving the objectives of this project.
Importing Libraries
import warnings
warnings.filterwarnings("ignore")
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report, f1_score
from xgboost import XGBClassifier
import tensorflow as tf
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, LearningRateScheduler
import pickle
Data Loading and EDA
dataset = pd.read_csv("./Data/restaurant_menu_optimization_data.csv")
dataset.shape
(1000, 6)
dataset.head(10)
  RestaurantID MenuCategory               MenuItem                                             Ingredients  Price Profitability
0         R003    Beverages                   Soda                                        ['confidential']   2.55           Low
1         R001   Appetizers  Spinach Artichoke Dip            ['Tomatoes', 'Basil', 'Garlic', 'Olive Oil']  11.12        Medium
2         R003     Desserts    New York Cheesecake                ['Chocolate', 'Butter', 'Sugar', 'Eggs']  18.66          High
3         R003  Main Course        Chicken Alfredo  ['Chicken', 'Fettuccine', 'Alfredo Sauce', 'Parmesan']  29.55          High
4         R002  Main Course          Grilled Steak  ['Chicken', 'Fettuccine', 'Alfredo Sauce', 'Parmesan']  17.73        Medium
5         R001   Appetizers      Stuffed Mushrooms            ['Tomatoes', 'Basil', 'Garlic', 'Olive Oil']  12.28          High
6         R001    Beverages                   Soda                                        ['confidential']   2.87           Low
7         R003     Desserts               Tiramisu                ['Chocolate', 'Butter', 'Sugar', 'Eggs']  10.47        Medium
8         R003  Main Course          Grilled Steak  ['Chicken', 'Fettuccine', 'Alfredo Sauce', 'Parmesan']  26.78          High
9         R003    Beverages               Lemonade                                        ['confidential']   4.95        Medium
dataset.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 6 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 RestaurantID 1000 non-null object
1 MenuCategory 1000 non-null object
2 MenuItem 1000 non-null object
3 Ingredients 1000 non-null object
4 Price 1000 non-null float64
5 Profitability 1000 non-null object
dtypes: float64(1), object(5)
memory usage: 47.0+ KB
dataset.isnull().sum()
RestaurantID     0
MenuCategory     0
MenuItem         0
Ingredients      0
Price            0
Profitability    0
dtype: int64
# Analyze the distribution of the target variable (Profitability).
sns.countplot(x='Profitability', data=dataset)
plt.show()
dataset['Profitability'].value_counts()
Medium 495
High 386
Low 119
Name: count, dtype: int64
for col in ['MenuCategory', 'MenuItem']:
    print(f"The Value Counts for {col} : \n{dataset[col].value_counts()}\n")
    print("-" * 75)
    sns.countplot(x=col, data=dataset)
    plt.show()
The Value Counts for MenuCategory :
MenuCategory
Beverages      264
Desserts       256
Appetizers     254
Main Course    226
Name: count, dtype: int64
---------------------------------------------------------------------------
The Value Counts for MenuItem :
MenuItem
Iced Tea 72
New York Cheesecake 71
Tiramisu 70
Soda 69
Caprese Salad 67
Coffee 66
Vegetable Stir-Fry 66
Spinach Artichoke Dip 64
Bruschetta 64
Fruit Tart 60
Stuffed Mushrooms 59
Lemonade 57
Grilled Steak 55
Chocolate Lava Cake 55
Shrimp Scampi 55
Chicken Alfredo 50
Name: count, dtype: int64
---------------------------------------------------------------------------
sns.histplot(dataset['Price'], kde=True)
plt.show()
cat_cols = [x for x in dataset.columns if dataset[x].dtypes == 'object']
cat_cols.remove('Profitability')
cat_cols
for col in cat_cols:
    contgnc_tbl = pd.crosstab(dataset[col], dataset['Profitability'])
    contgnc_tbl = contgnc_tbl.loc[:, ['Low', 'Medium', 'High']]
    plt.figure(figsize=(10, 6))
    plt.title(col)
    sns.heatmap(contgnc_tbl, annot=True, cmap='coolwarm', fmt='d')
    plt.show()
plt.figure(figsize=(10, 8))
plt.title("Price & Profitability")
sns.boxplot(data=dataset, x='Profitability', y='Price')
plt.show()
plt.figure(figsize=(10, 8))
plt.title("Price & MenuCategory")
sns.boxplot(data=dataset, x='MenuCategory', y='Price')
plt.show()
plt.figure(figsize=(10, 8))
plt.title("Price & RestaurantID")
sns.boxplot(data=dataset, x='RestaurantID', y='Price')
plt.show()
plt.figure(figsize=(10, 8))
plt.title("Price & MenuItem")
sns.boxplot(data=dataset, x='MenuItem', y='Price')
plt.show()
Q1 = dataset['Price'].quantile(0.25)
Q3 = dataset['Price'].quantile(0.75)
IQR = Q3 - Q1
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR
outliers = dataset[(dataset['Price'] < lower_bound) | (dataset['Price'] > upper_bound)]
print(outliers)
Empty DataFrame
Columns: [RestaurantID, MenuCategory, MenuItem, Ingredients, Price, Profitability]
Index: []
Feature Engineering
# Count the number of ingredients in each item's stringified list
dataset['NumIngredientsUsed'] = dataset['Ingredients'].apply(lambda x: len(x.split(',')))
dataset.head()
  RestaurantID MenuCategory               MenuItem                                             Ingredients  Price Profitability  NumIngredientsUsed
0         R003    Beverages                   Soda                                        ['confidential']   2.55           Low                   1
1         R001   Appetizers  Spinach Artichoke Dip            ['Tomatoes', 'Basil', 'Garlic', 'Olive Oil']  11.12        Medium                   4
2         R003     Desserts    New York Cheesecake                ['Chocolate', 'Butter', 'Sugar', 'Eggs']  18.66          High                   4
3         R003  Main Course        Chicken Alfredo  ['Chicken', 'Fettuccine', 'Alfredo Sauce', 'Parmesan']  29.55          High                   4
4         R002  Main Course          Grilled Steak  ['Chicken', 'Fettuccine', 'Alfredo Sauce', 'Parmesan']  17.73        Medium                   4
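The lambda above counts comma-separated tokens in the raw `Ingredients` string, which works here only because no ingredient name contains a comma. A more robust sketch parses the stringified list with `ast.literal_eval` first:

```python
import ast

import pandas as pd

# Sample rows mirroring the Ingredients column's stringified-list format.
sample = pd.DataFrame({
    "Ingredients": [
        "['confidential']",
        "['Tomatoes', 'Basil', 'Garlic', 'Olive Oil']",
    ]
})

# Parse each string into a real Python list, then count its elements.
sample["NumIngredientsUsed"] = sample["Ingredients"].apply(
    lambda x: len(ast.literal_eval(x))
)
print(sample["NumIngredientsUsed"].tolist())  # [1, 4]
```

For this dataset both approaches give the same counts; the parsed version simply fails loudly instead of silently miscounting if the ingredient format ever changes.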
# Encoding categorical cols
LE = LabelEncoder()
# Label Encoding for Target col
dataset['Profitability'] = LE.fit_transform(dataset['Profitability'])
# One-Hot Encoding for other cat cols
dataset = pd.get_dummies(dataset, columns=['RestaurantID', 'MenuCategory', 'MenuItem', 'Ingredients'])
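One subtlety worth noting: `LabelEncoder` assigns codes in alphabetical order, so `Profitability` maps to High=0, Low=1, Medium=2 rather than the intuitive Low < Medium < High ordering. A quick sanity check on the three labels:

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
encoded = le.fit_transform(["Low", "Medium", "High", "Medium"])

# classes_ holds the original labels in encoded (alphabetical) order.
print(dict(zip(le.classes_, range(len(le.classes_)))))
# {'High': 0, 'Low': 1, 'Medium': 2}
print(le.inverse_transform([0, 1, 2]).tolist())  # ['High', 'Low', 'Medium']
```

Keeping the fitted encoder around matters for deployment: `inverse_transform` is what turns the model's numeric predictions back into the Low/Medium/High labels.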
dataset.shape
(1000, 30)
dataset.head()
0 2.55 1 1 False
1 11.12 2 4 True
2 18.66 0 4 False
3 29.55 0 4 False
4 17.73 2 4 False
RestaurantID_R002 RestaurantID_R003 MenuCategory_Appetizers \
0 False True False
1 False False True
2 False True False
3 False True False
4 True False False
MenuCategory_Beverages MenuCategory_Desserts MenuCategory_Main Course \
0 True False False
1 False False False
2 False True False
3 False False True
4 False False True
... MenuItem_Shrimp Scampi MenuItem_Soda MenuItem_Spinach Artichoke Dip \
0 ... False True False
1 ... False False True
2 ... False False False
3 ... False False False
4 ... False False False
MenuItem_Stuffed Mushrooms MenuItem_Tiramisu MenuItem_Vegetable Stir-Fry \
0 False False False
1 False False False
2 False False False
3 False False False
4 False False False
Ingredients_['Chicken', 'Fettuccine', 'Alfredo Sauce', 'Parmesan'] \
0 False
1 False
2 False
3 True
4 True
Ingredients_['Chocolate', 'Butter', 'Sugar', 'Eggs'] \
0 False
1 False
2 True
3 False
4 False
Ingredients_['Tomatoes', 'Basil', 'Garlic', 'Olive Oil'] \
0 False
1 True
2 False
3 False
4 False
Ingredients_['confidential']
0 True
1 False
2 False
3 False
4 False
[5 rows x 30 columns]
dataset.columns = [col.replace('[', '').replace(']', '').replace("'", '').replace(" ", "_") for col in dataset.columns]
dataset.head()
0 2.55 1 1 False
1 11.12 2 4 True
2 18.66 0 4 False
3 29.55 0 4 False
4 17.73 2 4 False
RestaurantID_R002 RestaurantID_R003 MenuCategory_Appetizers \
0 False True False
1 False False True
2 False True False
3 False True False
4 True False False
MenuCategory_Beverages MenuCategory_Desserts MenuCategory_Main_Course \
0 True False False
1 False False False
2 False True False
3 False False True
4 False False True
... MenuItem_Shrimp_Scampi MenuItem_Soda MenuItem_Spinach_Artichoke_Dip \
0 ... False True False
1 ... False False True
2 ... False False False
3 ... False False False
4 ... False False False
MenuItem_Stuffed_Mushrooms MenuItem_Tiramisu MenuItem_Vegetable_Stir-Fry \
0 False False False
1 False False False
2 False False False
3 False False False
4 False False False
Ingredients_Chicken,_Fettuccine,_Alfredo_Sauce,_Parmesan \
0 False
1 False
2 False
3 True
4 True
Ingredients_Chocolate,_Butter,_Sugar,_Eggs \
0 False
1 False
2 True
3 False
4 False
Ingredients_Tomatoes,_Basil,_Garlic,_Olive_Oil Ingredients_confidential
0 False True
1 True False
2 False False
3 False False
4 False False
[5 rows x 30 columns]
bool_feat = []
for col in dataset.columns:
    if dataset[col].dtypes == 'bool':
        bool_feat.append(col)
bool_feat
['RestaurantID_R002',
'RestaurantID_R003',
'MenuCategory_Appetizers',
'MenuCategory_Beverages',
'MenuCategory_Desserts',
'MenuCategory_Main_Course',
'MenuItem_Bruschetta',
'MenuItem_Caprese_Salad',
'MenuItem_Chicken_Alfredo',
'MenuItem_Chocolate_Lava_Cake',
'MenuItem_Coffee',
'MenuItem_Fruit_Tart',
'MenuItem_Grilled_Steak',
'MenuItem_Iced_Tea',
'MenuItem_Lemonade',
'MenuItem_New_York_Cheesecake',
'MenuItem_Shrimp_Scampi',
'MenuItem_Soda',
'MenuItem_Spinach_Artichoke_Dip',
'MenuItem_Stuffed_Mushrooms',
'MenuItem_Tiramisu',
'MenuItem_Vegetable_Stir-Fry',
'Ingredients_Chicken,_Fettuccine,_Alfredo_Sauce,_Parmesan',
'Ingredients_Chocolate,_Butter,_Sugar,_Eggs',
'Ingredients_Tomatoes,_Basil,_Garlic,_Olive_Oil',
'Ingredients_confidential']
for col in bool_feat:
    dataset[col] = dataset[col].replace({True: 1, False: 0})
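The loop above works, but pandas can cast every boolean column in one vectorized step. A minimal equivalent sketch on a two-row frame:

```python
import pandas as pd

# Tiny frame mirroring the post-get_dummies layout: one numeric
# column plus boolean indicator columns.
df = pd.DataFrame({
    "Price": [2.55, 11.12],
    "RestaurantID_R003": [True, False],
    "MenuCategory_Beverages": [True, False],
})

# Select the boolean columns and cast them all to 0/1 integers in one call.
bool_cols = df.select_dtypes(include="bool").columns
df[bool_cols] = df[bool_cols].astype(int)
print(df["RestaurantID_R003"].tolist())  # [1, 0]
```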
dataset.head()
0 2.55 1 1 0
1 11.12 2 4 1
2 18.66 0 4 0
3 29.55 0 4 0
4 17.73 2 4 0
RestaurantID_R002 RestaurantID_R003 MenuCategory_Appetizers \
0 0 1 0
1 0 0 1
2 0 1 0
3 0 1 0
4 1 0 0
MenuCategory_Beverages MenuCategory_Desserts MenuCategory_Main_Course \
0 1 0 0
1 0 0 0
2 0 1 0
3 0 0 1
4 0 0 1
... MenuItem_Shrimp_Scampi MenuItem_Soda MenuItem_Spinach_Artichoke_Dip \
0 ... 0 1 0
1 ... 0 0 1
2 ... 0 0 0
3 ... 0 0 0
4 ... 0 0 0
MenuItem_Stuffed_Mushrooms MenuItem_Tiramisu MenuItem_Vegetable_Stir-Fry \
0 0 0 0
1 0 0 0
2 0 0 0
3 0 0 0
4 0 0 0
Ingredients_Chicken,_Fettuccine,_Alfredo_Sauce,_Parmesan \
0 0
1 0
2 0
3 1
4 1
Ingredients_Chocolate,_Butter,_Sugar,_Eggs \
0 0
1 0
2 1
3 0
4 0
Ingredients_Tomatoes,_Basil,_Garlic,_Olive_Oil Ingredients_confidential
0 0 1
1 1 0
2 0 0
3 0 0
4 0 0
[5 rows x 30 columns]
X = dataset.drop('Profitability', axis=1)
y = dataset['Profitability']
print(f"The shape of X : {X.shape}")
print(f"The shape of y : {y.shape}")
The shape of X : (1000, 29)
The shape of y : (1000,)
Build and Evaluate Model
trainX, testX, trainY, testY = train_test_split(X, y, test_size=0.2, random_state=42)
print(f"The shape of the trainX : {trainX.shape}")
print(f"The shape of the testX : {testX.shape}")
print(f"The shape of the trainY : {trainY.shape}")
print(f"The shape of the testY : {testY.shape}")
The shape of the trainX : (800, 29)
The shape of the testX : (200, 29)
The shape of the trainY : (800,)
The shape of the testY : (200,)
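Note that the Low class covers only 119 of the 1000 rows, so a plain random split can leave it under-represented in the 200-row test set. A sketch of a stratified alternative (shown on synthetic labels mirroring the 495/386/119 class mix; `stratify` is the only change to the call above):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic imbalanced labels: 495 Medium (2), 386 High (0), 119 Low (1).
y = np.array([2] * 495 + [0] * 386 + [1] * 119)
X = np.arange(len(y)).reshape(-1, 1)

# stratify=y preserves the class proportions in both partitions.
trX, teX, trY, teY = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(np.bincount(teY))  # per-class counts in the 200-row test set
```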
# Function to evaluate model
def evaluate_model(trueY, predY):
    from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix, classification_report
    # Compute accuracy
    accuracy = accuracy_score(trueY, predY)
    # Compute precision
    precision = precision_score(trueY, predY, average='weighted')
    # Compute recall
    recall = recall_score(trueY, predY, average='weighted')
    # Compute F1 score
    f1 = f1_score(trueY, predY, average='weighted')
    # Compute confusion matrix
    conf_matrix = confusion_matrix(trueY, predY)
    # Generate classification report
    class_report = classification_report(trueY, predY)
    # Print the metrics
    print("Model Evaluation Metrics:")
    print("-" * 30)
    print(f"Accuracy: {accuracy:.4f}")
    print(f"Precision: {precision:.4f}")
    print(f"Recall: {recall:.4f}")
    print(f"F1 Score: {f1:.4f}")
    print("\nConfusion Matrix:")
    print(conf_matrix)
    print("\nClassification Report:")
    print(class_report)
    # Return the metrics as a dictionary
    metrics = {
        'accuracy': accuracy,
        'precision': precision,
        'recall': recall,
        'f1_score': f1,
        'confusion_matrix': conf_matrix,
        'classification_report': class_report
    }
    return metrics
# Function to fit model using GridSearchCV
def fitModel(trainX, testX, trainY, testY, model_name, model_algo, params, CV):
    """
    Fits a machine learning model using GridSearchCV, refits it with the best parameters,
    evaluates it on both training and test data, and stores the evaluation metrics.

    Parameters:
    - trainX: Training features.
    - testX: Test features.
    - trainY: Training labels.
    - testY: Test labels.
    - model_name: Name of the model (string).
    - model_algo: Machine learning algorithm (estimator).
    - params: Dictionary of hyperparameters to tune.
    - CV: Number of cross-validation folds.

    Returns:
    - best_model: The model fitted with the best parameters.
    - best_params: Best hyperparameters found by GridSearchCV.
    - test_metrics: Evaluation metrics on the test set.
    """
    np.random.seed(10)
    print(f"{'-'*75}")
    print("Information")
    print(f"{'-'*75}")
    print(f"Fitting for Model: {model_name}")
    grid = GridSearchCV(
        estimator=model_algo,
        param_grid=params,
        scoring='accuracy',
        n_jobs=-1,
        cv=CV,
        verbose=1
    )
    res = grid.fit(trainX, trainY)
    print("Model fitting completed.")
    print(f"{'-'*75}")
    best_params = res.best_params_
    print(f"Found best parameters for model {model_name}: {best_params}")
    print(f"{'-'*75}")
    # Refit the model with the best parameters
    model_algo.set_params(**best_params)
    model_algo.fit(trainX, trainY)
    print(f"Completed refitting the model {model_name} with best parameters.")
    print(f"{'-'*75}")
    # Evaluate on the training data
    print(f"Evaluating {model_name} on the training data.")
    trainY_pred = model_algo.predict(trainX)
    print(f"Evaluation metrics for {model_name} on the training data:")
    train_metrics = evaluate_model(trainY, trainY_pred)
    print(f"{'-'*75}")
    # Evaluate on the test data
    print(f"Evaluating {model_name} on the test data.")
    testY_pred = model_algo.predict(testX)
    print(f"Evaluation metrics for {model_name} on the test data:")
    test_metrics = evaluate_model(testY, testY_pred)
    print(f"{'-'*75}")
    # Save the model only if it clears the 80% test-accuracy threshold
    test_accuracy = test_metrics['accuracy']
    if test_accuracy > 0.80:
        model_filename = f"./Models/{model_name}.pkl"
        with open(model_filename, 'wb') as f:
            pickle.dump(model_algo, f)
        print(f"Model {model_name} saved as '{model_name}.pkl'")
    return model_algo, best_params, test_metrics
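The models that fitModel saves can later be reloaded with pickle and used directly for prediction. A minimal round-trip sketch (using a toy DecisionTreeClassifier; in the notebook the bytes would come from a file under ./Models/):

```python
import pickle

from sklearn.tree import DecisionTreeClassifier

# Fit a small model on toy data: label 0 below x=1.5, label 1 above.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
clf = DecisionTreeClassifier().fit(X, y)

# Round-trip through pickle bytes (reading a .pkl file works the same way).
blob = pickle.dumps(clf)
restored = pickle.loads(blob)
print(restored.predict([[0.5], [2.5]]).tolist())  # [0, 1]
```

In production the same preprocessing (one-hot columns in the same order, the NumIngredientsUsed feature) must be applied to incoming rows before calling `predict`, since the pickle stores only the fitted estimator.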
# Define models and their hyperparameters
models = {
    'RandomForest': {
        'model': RandomForestClassifier(),
        'params': {
            'n_estimators': [10, 50, 100, 500],
            'max_depth': [None, 10, 20, 30]
        }
    },
    'DecisionTree': {
        'model': DecisionTreeClassifier(),
        'params': {
            'max_depth': [None, 10, 20, 30],
            'min_samples_split': [2, 10, 20]
        }
    },
    'XGBoost': {
        'model': XGBClassifier(),
        'params': {
            'n_estimators': [10, 50, 100],
            'max_depth': [3, 6, 9],
            'learning_rate': [0.01, 0.1, 0.2]
        }
    }
}
model_results = {}
for model_name, model_info in models.items():
    model_algo = model_info['model']
    params = model_info['params']
    best_model, best_params, test_metrics = fitModel(trainX, testX,
                                                     trainY, testY,
                                                     model_name, model_algo,
                                                     params, CV=5)
    model_results[model_name] = {
        'best_model': best_model,
        'best_params': best_params,
        'test_metrics': test_metrics
    }
for model_name, result in model_results.items():
    print(f"\nModel : {model_name}")
    print(f"Best Parameters : {result['best_params']}")
    print(f"Test Metrics : {result['test_metrics']}")
---------------------------------------------------------------------------
Information
---------------------------------------------------------------------------
Fitting for Model: RandomForest
Fitting 5 folds for each of 16 candidates, totalling 80 fits
Model fitting completed.
---------------------------------------------------------------------------
Found best parameters for model RandomForest: {'max_depth': 10, 'n_estimators': 50}
---------------------------------------------------------------------------
Completed refitting the model RandomForest with best parameters.
---------------------------------------------------------------------------
Evaluating RandomForest on the training data.
Evaluation metrics for RandomForest on the training data:
Model Evaluation Metrics:
------------------------------
Accuracy: 0.9675
Precision: 0.9682
Recall: 0.9675
F1 Score: 0.9671
Confusion Matrix:
[[302 0 7]
[ 11 80 3]
[ 5 0 392]]
Classification Report:
precision recall f1-score support
0 0.95 0.98 0.96 309
1 1.00 0.85 0.92 94
2 0.98 0.99 0.98 397
accuracy 0.97 800
macro avg 0.97 0.94 0.95 800
weighted avg 0.97 0.97 0.97 800
---------------------------------------------------------------------------
Evaluating RandomForest on the test data.
Evaluation metrics for RandomForest on the test data:
Model Evaluation Metrics:
------------------------------
Accuracy: 0.8800
Precision: 0.8750
Recall: 0.8800
F1 Score: 0.8742
Confusion Matrix:
[[71 3 3]
[ 5 13 7]
[ 5 1 92]]
Classification Report:
precision recall f1-score support
0 0.88 0.92 0.90 77
1 0.76 0.52 0.62 25
2 0.90 0.94 0.92 98
accuracy 0.88 200
macro avg 0.85 0.79 0.81 200
weighted avg 0.88 0.88 0.87 200
---------------------------------------------------------------------------
Model RandomForest saved as 'RandomForest.pkl'
---------------------------------------------------------------------------
Information
---------------------------------------------------------------------------
Fitting for Model: DecisionTree
Fitting 5 folds for each of 12 candidates, totalling 60 fits
Model fitting completed.
---------------------------------------------------------------------------
Found best parameters for model DecisionTree: {'max_depth': None, 'min_samples_split': 20}
---------------------------------------------------------------------------
Completed refitting the model DecisionTree with best parameters.
---------------------------------------------------------------------------
Evaluating DecisionTree on the training data.
Evaluation metrics for DecisionTree on the training data:
Model Evaluation Metrics:
------------------------------
Accuracy: 0.9363
Precision: 0.9363
Recall: 0.9363
F1 Score: 0.9352
Confusion Matrix:
[[295 2 12]
[ 13 72 9]
[ 12 3 382]]
Classification Report:
precision recall f1-score support
0 0.92 0.95 0.94 309
1 0.94 0.77 0.84 94
2 0.95 0.96 0.95 397
accuracy 0.94 800
macro avg 0.93 0.89 0.91 800
weighted avg 0.94 0.94 0.94 800
---------------------------------------------------------------------------
Evaluating DecisionTree on the test data.
Evaluation metrics for DecisionTree on the test data:
Model Evaluation Metrics:
------------------------------
Accuracy: 0.8800
Precision: 0.8743
Recall: 0.8800
F1 Score: 0.8744
Confusion Matrix:
[[69 3 5]
[ 5 13 7]
[ 2 2 94]]
Classification Report:
precision recall f1-score support
0 0.91 0.90 0.90 77
1 0.72 0.52 0.60 25
2 0.89 0.96 0.92 98
accuracy 0.88 200
macro avg 0.84 0.79 0.81 200
weighted avg 0.87 0.88 0.87 200
---------------------------------------------------------------------------
Model DecisionTree saved as 'DecisionTree.pkl'
---------------------------------------------------------------------------
Information
---------------------------------------------------------------------------
Fitting for Model: XGBoost
Fitting 5 folds for each of 27 candidates, totalling 135 fits
Model fitting completed.
---------------------------------------------------------------------------
Found best parameters for model XGBoost: {'learning_rate': 0.01, 'max_depth': 3, 'n_estimators': 100}
---------------------------------------------------------------------------
Completed refitting the model XGBoost with best parameters.
---------------------------------------------------------------------------
Evaluating XGBoost on the training data.
Evaluation metrics for XGBoost on the training data:
Model Evaluation Metrics:
------------------------------
Accuracy: 0.9250
Precision: 0.9248
Recall: 0.9250
F1 Score: 0.9232
Confusion Matrix:
[[291 3 15]
[ 15 67 12]
[ 12 3 382]]
Classification Report:
precision recall f1-score support
0 0.92 0.94 0.93 309
1 0.92 0.71 0.80 94
2 0.93 0.96 0.95 397
accuracy 0.93 800
macro avg 0.92 0.87 0.89 800
weighted avg 0.92 0.93 0.92 800
---------------------------------------------------------------------------
Evaluating XGBoost on the test data.
Evaluation metrics for XGBoost on the test data:
Model Evaluation Metrics:
------------------------------
Accuracy: 0.9000
Precision: 0.8983
Recall: 0.9000
F1 Score: 0.8931
Confusion Matrix:
[[73 2 2]
[ 5 13 7]
[ 4 0 94]]
Classification Report:
precision recall f1-score support
0 0.89 0.95 0.92 77
1 0.87 0.52 0.65 25
2 0.91 0.96 0.94 98
accuracy 0.90 200
macro avg 0.89 0.81 0.83 200
weighted avg 0.90 0.90 0.89 200
---------------------------------------------------------------------------
Model XGBoost saved as 'XGBoost.pkl'
Model : RandomForest
Best Parameters : {'max_depth': 10, 'n_estimators': 50}
Test Metrics : {'accuracy': 0.88, 'precision': 0.8750181554103122, 'recall': 0.88, 'f1_score': 0.8741936106088005, 'confusion_matrix': array([[71, 3, 3],
[ 5, 13, 7],
[ 5, 1, 92]], dtype=int64), 'classification_report': ' precision recall f1-score support\n\n 0 0.88 0.92 0.90 77\n 1 0.76 0.52 0.62 25\n 2 0.90 0.94 0.92 98\n\n accuracy 0.88 200\n macro avg 0.85 0.79 0.81 200\nweighted avg 0.88 0.88 0.87 200\n'}
Model : DecisionTree
Best Parameters : {'max_depth': None, 'min_samples_split': 20}
Test Metrics : {'accuracy': 0.88, 'precision': 0.8743455533487807, 'recall': 0.88, 'f1_score': 0.874404924760602, 'confusion_matrix': array([[69, 3, 5],
[ 5, 13, 7],
[ 2, 2, 94]], dtype=int64), 'classification_report': ' precision recall f1-score support\n\n 0 0.91 0.90 0.90 77\n 1 0.72 0.52 0.60 25\n 2 0.89 0.96 0.92 98\n\n accuracy 0.88 200\n macro avg 0.84 0.79 0.81 200\nweighted avg 0.87 0.88 0.87 200\n'}
Model : XGBoost
Best Parameters : {'learning_rate': 0.01, 'max_depth': 3, 'n_estimators': 100}
Test Metrics : {'accuracy': 0.9, 'precision': 0.8982617017917752, 'recall': 0.9, 'f1_score': 0.8930804702900591, 'confusion_matrix': array([[73, 2, 2],
[ 5, 13, 7],
[ 4, 0, 94]], dtype=int64), 'classification_report': ' precision recall f1-score support\n\n 0 0.89 0.95 0.92 77\n 1 0.87 0.52 0.65 25\n 2 0.91 0.96 0.94 98\n\n accuracy 0.90 200\n macro avg 0.89 0.81 0.83 200\nweighted avg 0.90 0.90 0.89 200\n'}
def createDNNmodel(input_dim, output_dim):
    np.random.seed(10)
    model = Sequential()
    # Hidden Layers
    model.add(Dense(256, input_dim=input_dim, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))
    model.add(Dense(128, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))
    model.add(Dense(64, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))
    model.add(Dense(32, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
    model.add(Dropout(0.5))
    model.add(Dense(32, activation='relu'))
    # Output Layer
    model.add(Dense(output_dim, activation='softmax'))
    # Compile the model
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model
input_dim = trainX.shape[1]
output_dim = 3
model = createDNNmodel(input_dim, output_dim)
model.summary()
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense_71 (Dense) │ (None, 256) │ 7,680 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_36 │ (None, 256) │ 1,024 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_48 (Dropout) │ (None, 256) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_72 (Dense) │ (None, 128) │ 32,896 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_37 │ (None, 128) │ 512 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_49 (Dropout) │ (None, 128) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_73 (Dense) │ (None, 64) │ 8,256 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_38 │ (None, 64) │ 256 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_50 (Dropout) │ (None, 64) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_74 (Dense) │ (None, 32) │ 2,080 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_51 (Dropout) │ (None, 32) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_75 (Dense) │ (None, 32) │ 1,056 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_76 (Dense) │ (None, 3) │ 99 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
trainY_encoded = to_categorical(trainY, num_classes=3)
testY_encoded = to_categorical(testY, num_classes=3)
# Callbacks
early_stopping = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
model_checkpoint = ModelCheckpoint('best_model.keras', save_best_only=True, monitor='val_loss')
# Define learning rate scheduler
def lr_schedule(epoch, lr):
    if epoch > 150:
        lr = lr * 0.1
    elif epoch > 100:
        lr = lr * 0.5
    return lr
lr_scheduler = LearningRateScheduler(lr_schedule)
history = model.fit(trainX, trainY_encoded,
                    epochs=150, batch_size=16,
                    validation_split=0.2,
                    verbose=1,
                    callbacks=[early_stopping, model_checkpoint, lr_scheduler])
Epoch 1/150
40/40 ━━━━━━━━━━━━━━━━━━━━ 5s 13ms/step - accuracy: 0.3446 - loss: 4.6720 - val_accuracy: 0.5688 - val_loss: 3.9413 - learning_rate: 0.0010
Epoch 2/150
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.5480 - loss: 3.8156 - val_accuracy: 0.5688 - val_loss: 3.4791 - learning_rate: 0.0010
Epoch 3/150
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.6083 - loss: 3.4847 - val_accuracy: 0.6375 - val_loss: 3.2114 - learning_rate: 0.0010
Epoch 4/150
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6099 - loss: 3.2450 - val_accuracy: 0.7000 - val_loss: 3.0037 - learning_rate: 0.0010
Epoch 5/150
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.6320 - loss: 3.1307 - val_accuracy: 0.8125 - val_loss: 2.8312 - learning_rate: 0.0010
Epoch 6/150
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.6735 - loss: 2.9330 - val_accuracy: 0.8188 - val_loss: 2.6782 - learning_rate: 0.0010
Epoch 7/150
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.6581 - loss: 2.7870 - val_accuracy: 0.8250 - val_loss: 2.5353 - learning_rate: 0.0010
Epoch 8/150
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7025 - loss: 2.5939 - val_accuracy: 0.8562 - val_loss: 2.4172 - learning_rate: 0.0010
Epoch 9/150
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6919 - loss: 2.5352 - val_accuracy: 0.7812 - val_loss: 2.3262 - learning_rate: 0.0010
...
Epoch 68/150
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8547 - loss: 0.6173 - val_accuracy: 0.9438 - val_loss: 0.4594 - learning_rate: 0.0010
...
Epoch 77/150
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7988 - loss: 0.7003 - val_accuracy: 0.9375 - val_loss: 0.4755 - learning_rate: 0.0010
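Rather than eyeballing dozens of epoch lines, the log text itself can be scanned for the best validation epoch. The helper below is an illustrative sketch, not part of the original notebook; it assumes Keras's default log line format (`val_accuracy: <float>`), and the two sample lines are taken from epochs 68–69 of the log above.

```python
import re

# Matches the val_accuracy field in a Keras fit() progress line.
VAL_ACC = re.compile(r"val_accuracy: ([0-9.]+)")

def best_val_epoch(log_text, first_epoch=1):
    """Return (epoch_number, val_accuracy) for the best validation epoch."""
    accs = [float(m.group(1)) for m in VAL_ACC.finditer(log_text)]
    if not accs:
        raise ValueError("no val_accuracy entries found")
    best_idx = max(range(len(accs)), key=accs.__getitem__)
    return first_epoch + best_idx, accs[best_idx]

log = """\
40/40 - accuracy: 0.8547 - loss: 0.6173 - val_accuracy: 0.9438 - val_loss: 0.4594
40/40 - accuracy: 0.8148 - loss: 0.6573 - val_accuracy: 0.9312 - val_loss: 0.4613
"""
print(best_val_epoch(log, first_epoch=68))  # -> (68, 0.9438)
```

The same idea is cleaner still with a Keras `History` object, where `history.history['val_accuracy']` is already a list of floats.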
test_loss, test_accuracy = model.evaluate(testX, testY_encoded, verbose=1)
print(f"Expanded Model Test Loss: {test_loss:.4f}")
print(f"Expanded Model Test Accuracy: {test_accuracy*100:.2f}%")
Expanded Model Test Loss: 0.5962
Expanded Model Test Accuracy: 88.00%
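Note that cross-entropy loss is a unitless quantity, not a percentage: per sample it is the negative log of the probability the model assigns to the true class. A quick sanity check (plain Python, one-hot labels assumed) shows what a loss near 0.60 roughly corresponds to:

```python
import math

def categorical_cross_entropy(p_true):
    """Cross-entropy for one sample: -ln(probability assigned to the true class)."""
    return -math.log(p_true)

# A loss around 0.60 corresponds to roughly 55% confidence in the true class.
print(round(categorical_cross_entropy(0.55), 4))  # -> 0.5978
```

So a test loss of 0.5962 alongside 88% accuracy suggests the model is often right but not highly confident.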
# Save the model only if it clears the accuracy threshold. Keras models are
# not reliably picklable; the native Keras save format is safer.
if test_accuracy > 0.80:
    dnn_modelName = './Models/DNN_model.keras'
    model.save(dnn_modelName)
    print(f"DNN model saved as {dnn_modelName}")
# Plot training & val accuracy values for the expanded model
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title("Expanded Model Accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend(["Train", "Validation"], loc='upper left')
plt.show()
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title("Expanded Model Loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend(["Train", "Validation"], loc="upper left")
plt.show()
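Epoch-level validation metrics bounce around quite a bit (the log above swings between ~0.84 and ~0.94 in adjacent epochs), which can make the raw curves hard to read. Smoothing with a trailing moving average before plotting makes the trend clearer. This is an illustrative helper, not from the original notebook; in practice you would pass it `history.history['val_accuracy']`:

```python
def moving_average(values, window=5):
    """Trailing moving average; early positions use a shorter window."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1) : i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Example: a noisy validation-accuracy series flattens toward its mean.
noisy = [0.80, 0.90, 0.70, 0.90, 0.80]
print(moving_average(noisy, window=5))
```

The smoothed series can then be plotted alongside the raw one with a second `plt.plot` call, using a lighter alpha for the raw curve.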